127 research outputs found

    Computing an Evolutionary Ordering is Hard

    We prove that computing an evolutionary ordering of a family of sets, i.e. an ordering where each set intersects with, but is not included in, the union of the earlier sets, is NP-hard.
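    The definition above can be checked directly for a given ordering. The following is a minimal sketch, not code from the paper: the function name `is_evolutionary_ordering` is hypothetical, and it assumes the first set is exempt from the condition because no earlier sets exist. Verifying a given ordering is easy; the NP-hardness concerns finding one.

```python
from typing import Hashable, Iterable, Sequence, Set


def is_evolutionary_ordering(sets: Sequence[Iterable[Hashable]]) -> bool:
    """Check whether the given order of sets is an evolutionary ordering.

    Assumption (not stated in the abstract): the first set is exempt,
    since there are no earlier sets to compare it with.
    """
    seen: Set[Hashable] = set()  # union of all earlier sets
    for position, current in enumerate(sets):
        current = set(current)
        if position > 0:
            # Must share at least one element with the union of earlier sets.
            if not (current & seen):
                return False
            # Must not be entirely contained in that union.
            if current <= seen:
                return False
        seen |= current
    return True
```

    For example, the order `[{1, 2}, {2, 3}, {3, 4}]` satisfies the condition, while `[{1, 2}, {3, 4}, {2, 3}]` does not, because `{3, 4}` is disjoint from `{1, 2}`.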

    Pancake Flipping is Hard

    Pancake Flipping is the problem of sorting a stack of pancakes of different sizes (that is, a permutation), when the only allowed operation is to insert a spatula anywhere in the stack and to flip the pancakes above it (that is, to perform a prefix reversal). In the burnt variant, one side of each pancake is marked as burnt, and it is required to finish with all pancakes having the burnt side down. Computing the optimal scenario for any stack of pancakes and determining the worst-case stack for any stack size have been challenges for more than three decades. Beyond being an intriguing combinatorial problem in itself, it also yields applications, e.g. in parallel computing and computational biology. In this paper, we show that the Pancake Flipping problem, in its original (unburnt) variant, is NP-hard, thus answering the long-standing question of its computational complexity. Comment: Corrected reference.
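    To make the flip operation concrete, here is a small sketch of a prefix reversal together with a simple greedy sorting strategy. It is an illustration only and not the paper's construction: the greedy strategy (bring the largest unsorted pancake to the top, then flip it into place) generally uses more flips than the optimum, which the abstract shows is NP-hard to compute. The names `flip` and `greedy_pancake_sort` are hypothetical, and index 0 is assumed to be the top of the stack.

```python
from typing import List, Sequence, Tuple


def flip(stack: Sequence[int], k: int) -> List[int]:
    """Prefix reversal: reverse the top k pancakes (index 0 is the top)."""
    return list(stack[:k])[::-1] + list(stack[k:])


def greedy_pancake_sort(stack: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Sort with prefix reversals only; return (sorted stack, flip sizes).

    This is a simple greedy heuristic, not an optimal-flip algorithm.
    """
    current = list(stack)
    flips: List[int] = []
    # Fix the largest pancake at the bottom, then the next largest, and so on.
    for size in range(len(current), 1, -1):
        largest_at = current.index(max(current[:size]))
        if largest_at == size - 1:
            continue  # already in its final position
        if largest_at != 0:
            # Bring the largest unsorted pancake to the top.
            current = flip(current, largest_at + 1)
            flips.append(largest_at + 1)
        # Flip it down into its final position.
        current = flip(current, size)
        flips.append(size)
    return current, flips
```

    Replaying the returned flip sizes on the original stack with `flip` reproduces the sorted stack, which gives a quick way to sanity-check a flip sequence.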

    Disorders and Permutations

    The tree-child network problem and the shortest common supersequences for permutations are NP-hard

    Reconstructing phylogenetic networks presents a significant and complex challenge within the fields of phylogenetics and genome evolution. One strategy for reconstructing phylogenetic networks is to solve the phylogenetic network problem, which involves first inferring phylogenetic trees and subsequently computing the smallest phylogenetic network that displays all the trees. This approach capitalizes on the exceptional tools available for inferring phylogenetic trees from biomolecular sequences. Since the vast space of phylogenetic networks makes comprehensive sampling difficult, researchers have turned their attention to inferring tree-child networks from multiple phylogenetic trees, where in a tree-child network each non-leaf node must have at least one child that is a tree node (i.e. an indegree-one node). We prove that the tree-child network problem for multiple trees remains NP-hard by a reduction from the shortest common supersequence problem for permutations, which we also prove to be NP-hard. Comment: 3 figures and 11 pages.
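    The tree-child condition described above is easy to test on a given network. The sketch below is an assumption-laden illustration rather than anything from the paper: the network is given as a list of directed `(parent, child)` edges, the function name `is_tree_child` is hypothetical, and the code checks only the tree-child property, not that the input is a valid phylogenetic network (acyclic, single root, and so on).

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def is_tree_child(edges: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """Return True if every non-leaf node has a child of indegree one."""
    children: Dict[Hashable, List[Hashable]] = defaultdict(list)
    indegree: Dict[Hashable, int] = defaultdict(int)
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Non-leaf nodes are exactly the nodes that appear as a parent.
    return all(
        any(indegree[child] == 1 for child in kids)
        for kids in children.values()
    )
```

    A node whose children are all reticulation nodes (indegree two or more) violates the condition, as in a network where two parents share both of their children.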

    The Complexity of Finding Effectors

    The NP-hard EFFECTORS problem on directed graphs is motivated by applications in network mining, particularly concerning the analysis of probabilistic information-propagation processes in social networks. In the corresponding model, the arcs carry probabilities, and a probabilistic diffusion process lets activated nodes activate their neighbors with the probabilities specified by the arcs. The goal is to explain a given network activation state as well as possible by using a minimum number of "effector nodes"; these are selected before the activation process starts. We correct, complement, and extend previous work from the data mining community by a more thorough computational complexity analysis of EFFECTORS, identifying both tractable and intractable cases. To this end, we also exploit a parameterization measuring the "degree of randomness" (the number of "really" probabilistic arcs), which might prove useful for analyzing other probabilistic network diffusion problems as well. Comment: 28 pages.

    Consensus Strings with Small Maximum Distance and Small Distance Sum

    The parameterised complexity of consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i.e., with a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between solution and input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly settle that of Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds.
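    The two distance bounds can be illustrated with a small verifier for the generalised Closest String setting. This is a sketch under stated assumptions, not the paper's algorithm: the names `hamming`, `satisfies_bounds`, `max_bound`, and `sum_bound` are hypothetical, and all strings are assumed to have equal length. It only checks a candidate solution; finding one is the hard part.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def satisfies_bounds(
    candidate: str,
    inputs: Sequence[str],
    max_bound: int,
    sum_bound: int,
) -> bool:
    """Check both the maximum-distance bound and the distance-sum bound."""
    distances = [hamming(candidate, s) for s in inputs]
    if not distances:
        return True  # no input strings, nothing to violate
    return max(distances) <= max_bound and sum(distances) <= sum_bound
```

    With inputs `["abc", "abd", "bbc"]` and candidate `"abc"`, the distances are 0, 1, and 1, so the candidate meets a maximum bound of 1 and a sum bound of 2 but fails a sum bound of 1.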

    Decomposing Cubic Graphs into Connected Subgraphs of Size Three

    Let $S=\{K_{1,3},K_3,P_4\}$ be the set of connected graphs of size 3. We study the problem of partitioning the edge set of a graph $G$ into graphs taken from any non-empty $S'\subseteq S$. The problem is known to be NP-complete for any possible choice of $S'$ in general graphs. In this paper, we assume that the input graph is cubic, and study the computational complexity of the problem of partitioning its edge set for any choice of $S'$. We identify all polynomial and NP-complete problems in that setting, and give graph-theoretic characterisations of $S'$-decomposable cubic graphs in some cases. Comment: to appear in the proceedings of COCOON 201

    Beyond Adjacency Maximization: Scaffold Filling for New String Distances

    In Genomic Scaffold Filling, one aims at polishing in silico a draft genome, called a scaffold. The scaffold is given in the form of an ordered set of gene sequences, called contigs. This is done by comparing the scaffold with an already complete reference genome from a close species. More precisely, given a scaffold S, a reference genome G and a score function f between two genomes, the aim is to complete S by adding the missing genes from G so that the obtained complete genome S* optimizes f(S*, G). In this paper, we extend a model of Jiang et al. [CPM 2016] (i) by allowing the insertion of strings instead of single characters (i.e., some groups of genes may be forced to be inserted together) and (ii) by considering two alternative score functions: the first generalizes the notion of common adjacencies by maximizing the number of common k-mers between S* and G (k-Mer Scaffold Filling), the second aims at minimizing the number of breakpoints between S* and G (Min-Breakpoint Scaffold Filling). We study these problems from the parameterized complexity point of view, providing fixed-parameter tractable (FPT) algorithms for both problems. In particular, we show that k-Mer Scaffold Filling is FPT with respect to the number of additional k-mers realized by the completion of S; this answers an open question of Jiang et al. [CPM 2016]. We also show that Min-Breakpoint Scaffold Filling is FPT with respect to a parameter combining the number of missing genes, the number of gene repetitions and the target distance.
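    As an illustration of the first score function, the sketch below counts common k-mers between two genomes. It rests on assumptions that the abstract does not confirm: genomes are modelled as strings with one character per gene, a k-mer is a contiguous window of length k, and common k-mers are counted as a multiset intersection. The paper's exact definition may differ, and the function names `kmer_counts` and `common_kmers` are hypothetical.

```python
from collections import Counter


def kmer_counts(genome: str, k: int) -> Counter:
    """Count every contiguous window of length k in the genome."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def common_kmers(completed: str, reference: str, k: int) -> int:
    """Number of k-mers shared by both genomes, counted with multiplicity.

    Assumption: multiset intersection, i.e. a k-mer occurring twice in one
    genome and once in the other contributes one to the score.
    """
    shared = kmer_counts(completed, k) & kmer_counts(reference, k)
    return sum(shared.values())
```

    For k = 2 this reduces to counting common adjacencies under the same modelling assumptions; for instance, `"abcab"` and `"abcd"` share the 2-mers `ab` and `bc`.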

    Tree Diet: Reducing the Treewidth to Unlock FPT Algorithms in RNA Bioinformatics

    Hard graph problems are ubiquitous in Bioinformatics, inspiring the design of specialized Fixed-Parameter Tractable algorithms, many of which rely on a combination of tree-decomposition and dynamic programming. The time/space complexities of such approaches hinge critically on low values for the treewidth tw of the input graph. In order to extend their scope of applicability, we introduce the Tree-Diet problem, i.e. the removal of a minimal set of edges such that a given tree-decomposition can be slimmed down to a prescribed treewidth tw'. Our rationale is that the time gained thanks to a smaller treewidth in a parameterized algorithm compensates for the extra post-processing needed to take deleted edges into account. Our core result is an FPT dynamic programming algorithm for Tree-Diet, using 2^{O(tw)}n time and space. We complement this result with parameterized complexity lower bounds for stronger variants (e.g., NP-hardness when tw' or tw-tw' is constant). We propose a prototype implementation for our approach, which we apply to difficult instances of selected RNA-based problems: RNA design, sequence-structure alignment, and search of pseudoknotted RNAs in genomes, revealing very encouraging results. This work paves the way for a wider adoption of tree-decomposition-based algorithms in Bioinformatics.